I. INTRODUCTION

Entanglement is a fragile feature of composite quantum
systems that can easily be diminished by uncontrollable
interactions with the environment. At the same time, however,
carefully crafted entangled states can protect quantum
coherence from the deleterious effects of those random
interactions. This idea underlies the principles of quantum
error correcting codes, which strengthen the optimism
regarding the feasibility of implementing complex quantum
information processing tasks in practice.

In this paper we demonstrate how quantum entangle- 
ment can help in the task of classical communication. To 
this end, we develop a simple model of a noisy commu- 
nication channel, where the noise affecting consecutive 
transmissions is correlated. Within this model, we de- 
rive bounds on the classical channel capacity assuming 
either separable or entangled input states, and we show 
that using collective entangled states of transmitted par- 
ticles leads to an enhanced capacity of the channel. 

The motivation for our model comes from classical 
fiber optic communications. In practice, light transmitted
through a fiber optic link undergoes a random
change of polarization induced by the birefringence of the 
fiber. The fiber birefringence usually fluctuates depend- 
ing on the environmental conditions such as tempera- 
ture and mechanical strain. At first sight, this makes the 
polarization degree of freedom unsuitable for encoding 
information, as the input polarization state gets scram- 
bled on average to a completely mixed state. However, 
the birefringence fluctuations have a certain time con- 
stant which means that the transformation of the polar- 
ization state, though random, remains nearly the same 
on short time scales. Consider now sending a pair of 
photons whose temporal separation lies well within this 
time scale. Although the polarization state of each
photon, when looked at separately, becomes randomized,
certain properties of the joint state remain preserved.
One example is the relative polarization of the second
photon with respect to the first one. We can therefore try
to decode from the output whether the input polarizations
were mutually parallel or orthogonal. This property cannot
be determined perfectly, as in general we cannot tell
whether two quantum states are identical or orthogonal if
we do not know anything else about them, but even the ability
of providing a partial answer establishes correlations be- 
tween the channel input and output that can be used to 
encode information into the polarization degree of free- 
dom. The situation becomes even more interesting when 
we allow for entangled quantum states. Then the singlet 
polarization state of the two photons, when sent as the 
input, remains invariant under such perfectly correlated 
depolarization, and it can be discriminated unambigu- 
ously against the triplet subspace. Therefore we can en- 
code one bit of information into the polarization state of 
two photons by sending either a singlet state or any of 
the triplet states. We shall see that these simple obser- 
vations will also emerge from our general analysis of the 
channel capacity. 

The first example of entanglement-enhanced information transmission over a quantum channel with correlated noise has recently been analyzed by Macchiavello and Palma [4]. Our model assumes a different form of correlations, and its high degree of symmetry has allowed us to optimize the channel capacity over arbitrary input ensembles. Although we analyze only zero- and one-photon signals, we define the action of the channel in terms of transformations of the bosonic annihilation operators, which sets up a framework for possible generalizations, such as the use of multiphoton signals. This application of entanglement in classical communication is a distinct problem from the entanglement-assisted classical capacity of noisy quantum channels studied by Bennett et al., where it has been shown that prior entanglement shared between sender and receiver can increase the classical capacity. We also note that the non-zero time constant of phase and polarization fluctuations can be used in robust protocols for long-haul quantum key distribution.

Before passing on to a detailed discussion of the problem in the subsequent sections, let us introduce some basic notation. The action of a channel is described by a completely positive map that we will denote by $\Lambda(\cdot)$. The sender selects messages from an input ensemble $\{p_i, \varrho_i\}$, where $p_i$ is the probability of sending the state $\varrho_i$ through the channel. The capacity of the channel is a function of the mutual information between the input ensemble and the measurement outcomes at the receiving station: it characterizes the strength of correlations between these two that are preserved by the channel. The mutual information itself involves a specific measurement scheme; however, it has a very useful upper bound in the form of the Holevo quantity, which depends only on the output ensemble of states $\{p_i, \Lambda(\varrho_i)\}$ emerging from the channel:

$$\chi = S\Big(\sum_i p_i \Lambda(\varrho_i)\Big) - \sum_i p_i S\big(\Lambda(\varrho_i)\big), \tag{1}$$

where $S$ is the von Neumann entropy $S(\varrho) = -\mathrm{Tr}(\varrho \log_2 \varrho)$. As we will see, in our model the Holevo quantity provides a tight bound on the mutual information that can be achieved in practice using a simple measurement scheme. The classical channel capacity is obtained by assuming arbitrarily long sequences of possibly entangled input systems and calculating the average capacity per single use of the channel. In our analysis, we will perform a restricted optimization by considering only two consecutive uses of the channel.
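As an illustration of Eq. (1), the following minimal Python sketch evaluates the Holevo quantity for an ensemble of single-qubit output states. It assumes that each output state is specified by its Bloch vector, so that its eigenvalues are $(1 \pm |r|)/2$; the function names are hypothetical and are not part of the model developed in this paper.

```python
import math

def binary_entropy(x):
    """Shannon entropy (in bits) of the distribution (x, 1 - x)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def qubit_entropy(bloch):
    """Von Neumann entropy of a qubit state given by its Bloch vector.

    A qubit density matrix with Bloch vector r has eigenvalues (1 +/- |r|)/2.
    """
    r = min(1.0, math.sqrt(sum(x * x for x in bloch)))
    return binary_entropy((1.0 + r) / 2.0)

def holevo_quantity(probs, blochs):
    """Eq. (1) for an ensemble of qubit output states {p_i, Lambda(rho_i)}."""
    avg = [sum(p * b[k] for p, b in zip(probs, blochs)) for k in range(3)]
    avg_entropy = sum(p * qubit_entropy(b) for p, b in zip(probs, blochs))
    return qubit_entropy(avg) - avg_entropy

# Two orthogonal pure states sent with equal probability.
chi_orthogonal = holevo_quantity([0.5, 0.5], [(0, 0, 1), (0, 0, -1)])
# The same two states after their Bloch vectors have been shrunk by one half.
chi_noisy = holevo_quantity([0.5, 0.5], [(0, 0, 0.5), (0, 0, -0.5)])
print(chi_orthogonal, chi_noisy)
```

Shrinking the Bloch vectors lowers the Holevo quantity below one bit, which is the qualitative effect of depolarizing noise that the rest of the paper quantifies.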

II. CHANNEL DECOMPOSITION 

We will start our discussion by proving a rather general
lemma about channels that can be decomposed into
a direct sum of maps acting on subspaces of the Hilbert
space of the input systems. In physical terms, such channels
remove quantum coherence between the components
of the input state that belong to different subspaces, by 
zeroing the respective off-diagonal blocks of the density 
matrix characterizing the input state. This lemma will 
greatly simplify our further calculations. 

Lemma 1: Suppose that we can decompose the Hilbert space $\mathcal{H}$ of the system into a direct sum of subspaces

$$\mathcal{H} = \bigoplus_k \mathcal{H}^{(k)} \tag{2}$$

such that for an arbitrary input state $\varrho$ the state emerging from the channel, $\Lambda(\varrho)$, can be represented as

$$\Lambda(\varrho) = \bigoplus_k \Lambda^{(k)}\big(\varrho^{(k)}\big), \tag{3}$$

where $\varrho^{(k)} = \varrho|_{\mathcal{H}^{(k)}}$ is the input state $\varrho$ truncated to the subspace $\mathcal{H}^{(k)}$, and each $\Lambda^{(k)}$ is a certain trace-preserving completely positive map acting in the corresponding subspace $\mathcal{H}^{(k)}$. Then the optimal channel capacity can be attained with an ensemble in which each state belongs to one of the subspaces $\mathcal{H}^{(k)}$.

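Proof: Indeed, suppose that there is a state $\varrho$ that does not satisfy the above condition, i.e. it is defined on more than one subspace $\mathcal{H}^{(k)}$. We can replace it by a subensemble $\{\mathrm{Tr}(\varrho^{(k)});\ \varrho^{(k)}/\mathrm{Tr}(\varrho^{(k)})\}$, obtained by truncating the state $\varrho$ to the subspaces $\mathcal{H}^{(k)}$ and normalizing the resulting density matrices. In other words, whenever the sender is supposed to transmit $\varrho$, she replaces it by one of the normalized truncated states $\varrho^{(k)}/\mathrm{Tr}(\varrho^{(k)})$ with the corresponding probability $\mathrm{Tr}(\varrho^{(k)})$. It is straightforward to verify that the average output state obtained from such a subensemble is identical to $\Lambda(\varrho)$. The first term of Eq. (1) is therefore unchanged, while the output entropies entering the second term can only decrease under this replacement, so the Holevo quantity is not reduced.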

FIG. 1: Representation of two consecutive temporal slots labelled by $A$ and $B$. The Hilbert space of each slot is spanned in our model by three states: the zero-photon state $|0\rangle$ and two mutually orthogonal polarization states denoted by $|{\leftrightarrow}\rangle$ and $|{\updownarrow}\rangle$.

The above observation has a useful consequence when optimizing the Holevo bound on the channel capacity. If the input ensemble is of the form discussed above, then it can be split into subensembles of states that belong to separate subspaces $\mathcal{H}^{(k)}$, with the probability distributions normalized to one within each subensemble, and $p_k$ denoting the probability of sending a state from the $k$th subensemble. It is then easy to check that the Holevo quantity is given by the following expression:

$$\chi = \sum_k p_k \chi^{(k)} - \sum_k p_k \log_2 p_k, \tag{4}$$

where $\chi^{(k)}$ is the Holevo quantity for the $k$th subensemble. Therefore, the maximization of the Holevo quantity can be performed in two steps. The first one is the optimization of each $\chi^{(k)}$ separately, assuming an input ensemble restricted to the subspace $\mathcal{H}^{(k)}$. The second step consists of optimizing the probability distribution $p_k$ with the normalization constraint $\sum_k p_k = 1$, and it can be performed explicitly using the method of Lagrange multipliers. Indeed, if we denote the Lagrange multiplier as $\lambda$, then differentiation over $p_l$ yields:

$$\log_2 p_l = \chi^{(l)} - \frac{1}{\ln 2} - \lambda. \tag{5}$$

This formula allows us to express the probabilities $p_l$ in terms of the Lagrange multiplier $\lambda$ as:

$$p_l = 2^{\chi^{(l)} - 1/\ln 2 - \lambda}, \tag{6}$$

and furthermore summation over $l$ and using the fact that $\sum_l p_l = 1$ gives the value of the Lagrange multiplier as:

$$\lambda = \log_2\Big(\sum_l 2^{\chi^{(l)}}\Big) - \frac{1}{\ln 2}. \tag{7}$$

Finally, inserting Eqs. (6) and (7) into Eq. (4) yields the maximum value of the Holevo quantity equal to:

$$\chi = \log_2\Big(\sum_k 2^{\chi^{(k)}}\Big). \tag{8}$$

We will later find this expression useful in calculating the channel capacity in our model. The physical reason for this is that we will be able to decompose the set of states used for communication into subensembles with a fixed number of photons, and then optimize the Holevo quantity separately in each subspace.
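The two-step optimization described above can be checked numerically. The sketch below, written with the Python standard library only, evaluates the combined Holevo quantity for given subspace probabilities, builds the probabilities proportional to $2^{\chi^{(k)}}$ that follow from the Lagrange condition, and compares the result with $\log_2 \sum_k 2^{\chi^{(k)}}$. The function names and the sample values of $\chi^{(k)}$ are assumptions made for illustration only.

```python
import math

def holevo_from_subspaces(probs, chis):
    """Combined Holevo quantity: sum_k p_k chi_k - sum_k p_k log2 p_k."""
    return sum(p * c - (p * math.log2(p) if p > 0.0 else 0.0)
               for p, c in zip(probs, chis))

def optimal_subspace_probs(chis):
    """Stationary point of the Lagrange problem: p_k proportional to 2**chi_k."""
    weights = [2.0 ** c for c in chis]
    total = sum(weights)
    return [w / total for w in weights]

def max_holevo(chis):
    """Closed-form maximum: log2 of the sum of 2**chi_k."""
    return math.log2(sum(2.0 ** c for c in chis))

# Hypothetical per-subspace Holevo quantities, chosen for illustration only.
chis = [0.0, 1.0, 1.0]
p_opt = optimal_subspace_probs(chis)
uniform = [1.0 / len(chis)] * len(chis)

print(p_opt)
print(holevo_from_subspaces(p_opt, chis), max_holevo(chis))
print(holevo_from_subspaces(uniform, chis))  # a non-optimal choice gives less
```

A subspace with $\chi^{(k)} = 0$ still receives a non-zero probability, because using it as one more distinguishable symbol increases the entropy term.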

III. DEPOLARIZATION MODEL 

Let us now introduce a mathematical model for the random transformation of polarization during transmission through the channel. A general linear transformation between two annihilation operators corresponding to a pair of orthogonal modes is given by $2 \times 2$ unitary matrices that form the Lie group U(2). In situations when only the relative phase between the two polarization modes is relevant, the overall phase of the transformation can be assumed to be fixed, which reduces the group of transformations to SU(2). However, in our case the overall phase shift can vary between the consecutive temporal slots, and therefore we need to keep it as an independent parameter. We note that any U(2) matrix can be mapped onto a rotation in the three-dimensional physical space. Such a rotation describes the corresponding transformation of the Poincaré sphere used to represent the polarization state of light in classical optics. We will label elements of U(2) as $\Omega$ and use a dot to denote the multiplication within the group. The U(2) group has a natural invariant integration measure, which we assume is normalized to one: $\int d\Omega = 1$. This measure defines a uniformly randomized distribution of polarization transformations that scrambles an arbitrary input polarization to a completely mixed one.
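The scrambling property of the invariant measure can be illustrated with a small Monte Carlo experiment. The sketch below is an assumption-laden illustration rather than part of the model: it draws SU(2) matrices uniformly by normalizing four Gaussian numbers into a unit quaternion (the overall phase is omitted because it does not affect the polarization of a single photon), applies them to a fixed input polarization, and averages the resulting Bloch (Stokes) vector. All function names are hypothetical.

```python
import math
import random

def random_su2(rng):
    """Draw an SU(2) matrix from the invariant (Haar) measure.

    A unit quaternion obtained by normalizing four Gaussian numbers is
    uniformly distributed on the 3-sphere, which is the SU(2) group manifold.
    """
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in q))
    a, b, c, d = (x / norm for x in q)
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def bloch_vector(psi):
    """Bloch (Stokes) vector of a normalized two-component polarization state."""
    overlap = psi[0].conjugate() * psi[1]
    return (2.0 * overlap.real, 2.0 * overlap.imag,
            abs(psi[0]) ** 2 - abs(psi[1]) ** 2)

def average_bloch_after_scrambling(psi, samples, seed=0):
    """Average output Bloch vector over randomly drawn polarization rotations."""
    rng = random.Random(seed)
    total = [0.0, 0.0, 0.0]
    for _ in range(samples):
        u = random_su2(rng)
        out = [u[0][0] * psi[0] + u[0][1] * psi[1],
               u[1][0] * psi[0] + u[1][1] * psi[1]]
        for k, component in enumerate(bloch_vector(out)):
            total[k] += component / samples
    return total

# Horizontally polarized input; the averaged Bloch vector should be near zero.
avg = average_bloch_after_scrambling([1.0 + 0j, 0j], samples=20000)
print(avg)
```

A vanishing average Bloch vector corresponds to the completely mixed polarization state; the residual components shrink with the number of samples.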

Suppose now that two consecutive temporal slots labelled by $A$ and $B$, each comprising two orthogonal polarizations, are occupied by a joint state of radiation $\varrho_{AB}$, as shown schematically in Fig. 1. We will assume that the polarization transformation $\Omega_A$ affecting the slot $A$ is completely random, but that the transformation $\Omega_B$ is correlated with the first one through a conditional probability distribution $p(\Omega_B|\Omega_A)$. The resulting transformation of the joint two-slot state is therefore given by the following completely positive map:

$$\Lambda(\varrho_{AB}) = \int d\Omega_A \int d\Omega_B \, p(\Omega_B|\Omega_A)\, U(\Omega_A) \otimes U(\Omega_B)\, \varrho_{AB}\, U^\dagger(\Omega_A) \otimes U^\dagger(\Omega_B). \tag{9}$$

Here $U(\Omega)$ is a unitary matrix acting in the Hilbert space of one of the slots that represents the polarization transformation $\Omega$. We will now assume that the conditional probability $p(\Omega_B|\Omega_A)$ depends only on the relative transformation between the slots $A$ and $B$ and that it can consequently be represented as $p(\Omega_B|\Omega_A) = p(\Omega_B \cdot \Omega_A^{-1})$. In such a case, we can substitute the integration variables in the second integral according to $\Omega_B = \Omega' \cdot \Omega_A$ and make use of the invariance of the integration measure, $d\Omega_B = d\Omega'$. This procedure shows that the map $\Lambda$ can be represented as a composition of two maps: $\Lambda = (\mathbb{1} \otimes \Lambda_{\mathrm{dep}}) \circ \Lambda_{\mathrm{perf}}$. The first one of them, $\Lambda_{\mathrm{perf}}$, acts on both temporal slots and depolarizes them in exactly the same way:

$$\Lambda_{\mathrm{perf}}(\varrho_{AB}) = \int d\Omega \; U(\Omega) \otimes U(\Omega)\, \varrho_{AB}\, U^\dagger(\Omega) \otimes U^\dagger(\Omega). \tag{10}$$

The second map, $\Lambda_{\mathrm{dep}}$, acts only on the slot $B$, and it introduces additional depolarization relative to the slot $A$ according to the probability distribution $p(\Omega')$:

$$\Lambda_{\mathrm{dep}}(\varrho_B) = \int d\Omega' \, p(\Omega')\, U(\Omega')\, \varrho_B\, U^\dagger(\Omega'). \tag{11}$$

We will assume later that the distribution $p(\Omega')$ has sufficient symmetry to describe the action of the map $\Lambda_{\mathrm{dep}}$ in the relevant Hilbert space with the help of two simple parameters.

We now introduce a further simplification by imposing a condition that each temporal slot may contain at most one photon. Therefore the relevant Hilbert space for each slot is spanned by three states: the zero-photon state $|0\rangle$, and horizontally and vertically polarized one-photon states $|{\leftrightarrow}\rangle$ and $|{\updownarrow}\rangle$. We can conveniently write the explicit form of the unitary transformation $U(\Omega)$ using the irreducible unitary representations of the group SU(2). We will denote by $\mathcal{D}^j(\Omega)$ a $(2j+1) \times (2j+1)$ matrix that is a $(2j+1)$-dimensional representation of an SU(2) element obtained from $\Omega$ by fixing the overall phase factor to one. These matrices are well known in the quantum theory of angular momentum as describing transformations of a spin-$j$ particle under the rotation group [12]. We will also denote by $\alpha(\Omega)$ the overall phase of the element $\Omega$. Then the unitary transformation of the input state corresponding to the polarization rotation $\Omega$ is given by the matrix:

$$U(\Omega) = \begin{pmatrix} \mathcal{D}^0(\Omega) & 0 \\ 0 & e^{i\alpha(\Omega)} \mathcal{D}^{1/2}(\Omega) \end{pmatrix}. \tag{12}$$

In this formula, the one-dimensional representation $\mathcal{D}^0(\Omega)$ is identically equal to one, and $e^{i\alpha(\Omega)} \mathcal{D}^{1/2}(\Omega)$ is a $2 \times 2$ unitary matrix itself; however, we will keep this more general notation in order to be able to use results from the theory of group representations. In particular, the following property of the rotation matrix elements will allow us to evaluate directly a number of expressions:

$$\int d\Omega\, \big[\mathcal{D}^{j}_{mn}(\Omega)\big]^* \mathcal{D}^{j'}_{m'n'}(\Omega) = \frac{1}{2j+1}\, \delta_{jj'}\, \delta_{mm'}\, \delta_{nn'}. \tag{13}$$

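The action of the map $\Lambda_{\mathrm{perf}}$ on a joint two-slot state can be analyzed most easily if we decompose the complete Hilbert space into a direct sum of subspaces with a fixed number of photons: $\mathcal{H} = \mathcal{H}^{(0)} \oplus \mathcal{H}^{(1)} \oplus \mathcal{H}^{(2)}$, where the upper index labels the number of photons. The zero-photon subspace is spanned by a single state $|0_A 0_B\rangle$. The one-photon subspace has a basis formed by four vectors: $|{\leftrightarrow}_A 0_B\rangle$, $|{\updownarrow}_A 0_B\rangle$, $|0_A {\leftrightarrow}_B\rangle$, and $|0_A {\updownarrow}_B\rangle$. Finally, in the two-photon subspace $\mathcal{H}^{(2)}$ we will introduce a basis that consists of the singlet state $|\Psi_-\rangle = (|{\leftrightarrow}_A {\updownarrow}_B\rangle - |{\updownarrow}_A {\leftrightarrow}_B\rangle)/\sqrt{2}$ and the three triplet states $|{\leftrightarrow}_A {\leftrightarrow}_B\rangle$, $|\Psi_+\rangle = (|{\leftrightarrow}_A {\updownarrow}_B\rangle + |{\updownarrow}_A {\leftrightarrow}_B\rangle)/\sqrt{2}$, and $|{\updownarrow}_A {\updownarrow}_B\rangle$. The reason for this choice is that the action of the tensor product $\mathcal{D}^{1/2}(\Omega) \otimes \mathcal{D}^{1/2}(\Omega)$ on a two-photon state can then be decomposed into the sum $\mathcal{D}^{1/2}(\Omega) \otimes \mathcal{D}^{1/2}(\Omega) = \mathcal{D}^0(\Omega) \oplus \mathcal{D}^1(\Omega)$, where $\mathcal{D}^0(\Omega)$ acts on the singlet state and $\mathcal{D}^1(\Omega)$ is a three-dimensional matrix acting in the triplet subspace. Using our decomposition of the complete Hilbert space, the action of the tensor product $U(\Omega) \otimes U(\Omega)$ on a general two-slot state in the basis specified above is given by:

$$U(\Omega) \otimes U(\Omega) = \mathcal{D}^0(\Omega) \oplus e^{i\alpha(\Omega)} \begin{pmatrix} \mathcal{D}^{1/2}(\Omega) & 0 \\ 0 & \mathcal{D}^{1/2}(\Omega) \end{pmatrix} \oplus e^{2i\alpha(\Omega)} \begin{pmatrix} \mathcal{D}^0(\Omega) & 0 \\ 0 & \mathcal{D}^1(\Omega) \end{pmatrix}. \tag{14}$$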



If we now insert this formula into Eq. (10), it can be easily seen that the invariant integration over the overall phase factor $\alpha(\Omega)$ removes all the off-diagonal blocks of the density matrix that link different subspaces $\mathcal{H}^{(k)}$. In other words, all the coherence between states with different photon numbers is completely removed by the phase fluctuations. Furthermore, the operation $\Lambda_{\mathrm{dep}}$, acting only on the second slot, does not mix subspaces with different photon numbers. Therefore the conditions of our lemma are satisfied and we can consider only states with a definite number of photons as elements of the input ensemble. Thus what we need to calculate are the three corresponding Holevo quantities $\chi^{(0)}$, $\chi^{(1)}$, and $\chi^{(2)}$, which can be combined into a Holevo bound for the overall channel capacity according to Eq. (8). This calculation forms the contents of the next section.
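The block structure in the two-photon subspace rests on the fact that the singlet state is mapped onto itself, up to an overall phase, when both photons undergo the same polarization transformation, while triplet states stay orthogonal to it. The following sketch checks this numerically for a randomly drawn U(2) matrix. It is a minimal illustration using only the Python standard library; the helper names, and the parametrization of U(2) as an SU(2) element times a phase $e^{i\alpha}$, are assumptions of the example.

```python
import cmath
import math
import random

def random_u2(rng):
    """Random U(2) matrix: a Haar-random SU(2) element times an overall phase."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in q))
    a, b, c, d = (x / norm for x in q)
    phase = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    su2 = [[complex(a, b), complex(c, d)],
           [complex(-c, d), complex(a, -b)]]
    return [[phase * x for x in row] for row in su2], phase

def apply_to_both(u, psi):
    """Apply U (x) U to a two-photon state with amplitudes psi[i][j]."""
    return [[sum(u[i][k] * u[j][l] * psi[k][l]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

def overlap(phi, psi):
    """Inner product <phi|psi> of two two-photon states."""
    return sum(phi[i][j].conjugate() * psi[i][j]
               for i in range(2) for j in range(2))

s = 1.0 / math.sqrt(2.0)
# Index 0 = horizontal, index 1 = vertical; first index slot A, second slot B.
singlet = [[0j, s + 0j], [-s + 0j, 0j]]
triplet_hh = [[1.0 + 0j, 0j], [0j, 0j]]

rng = random.Random(1)
u, phase = random_u2(rng)
rotated_singlet = apply_to_both(u, singlet)
rotated_triplet = apply_to_both(u, triplet_hh)

# The singlet only acquires the phase exp(2 i alpha); the rotated triplet
# state has no singlet component.
print(abs(overlap(singlet, rotated_singlet) - phase ** 2))
print(abs(overlap(singlet, rotated_triplet)))
```

Both printed numbers should vanish up to rounding errors, independently of the seed.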



IV. CHANNEL CAPACITY 

The communication capacity of the zero-photon subspace $\mathcal{H}^{(0)}$ by itself is naturally zero, as we have only a single state $|0_A 0_B\rangle$ at our disposal. This state can of course be used as an element of a larger ensemble, thus contributing to the overall capacity. This fact is reflected in the form of Eq. (8), where $\chi^{(0)} = 0$ does indeed increase the total value of $\chi$.



A. One-photon subspace 

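A less trivial problem is to calculate the capacity of the one-photon subspace. If we assume a normalized input state $\varrho_{\mathrm{in}}$ from the subspace $\mathcal{H}^{(1)}$, then the action of the channel $\Lambda_{\mathrm{perf}}$ restricted to this subspace is given by:

$$\Lambda^{(1)}_{\mathrm{perf}}(\varrho_{\mathrm{in}}) = \frac{1}{2} \begin{pmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ b^* & 0 & 1-a & 0 \\ 0 & b^* & 0 & 1-a \end{pmatrix}, \tag{15}$$

where the parameters $a$ and $b$ are defined in terms of the input density matrix as:

$$\begin{aligned} a &= \langle {\leftrightarrow}_A 0_B|\varrho_{\mathrm{in}}|{\leftrightarrow}_A 0_B\rangle + \langle {\updownarrow}_A 0_B|\varrho_{\mathrm{in}}|{\updownarrow}_A 0_B\rangle, \\ b &= \langle {\leftrightarrow}_A 0_B|\varrho_{\mathrm{in}}|0_A {\leftrightarrow}_B\rangle + \langle {\updownarrow}_A 0_B|\varrho_{\mathrm{in}}|0_A {\updownarrow}_B\rangle. \end{aligned} \tag{16}$$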

For the form of the density matrix given in Eq. (15), the depolarizing channel $\Lambda_{\mathrm{dep}}$ affects only the off-diagonal elements $b$ and $b^*$. We will assume that the symmetry of the distribution $p(\Omega')$ is such that the effect of $\Lambda_{\mathrm{dep}}$ is a rescaling of these elements by a real parameter $\eta$ bounded between 0 and 1. It is now easy to check that the entropy of the one-photon state emerging from the channel can be written as

$$S\big(\Lambda(\varrho_{\mathrm{in}})\big) = 1 + S\begin{pmatrix} a & \eta b \\ \eta b^* & 1-a \end{pmatrix}, \tag{17}$$

where the $2 \times 2$ matrix appearing in the second term can be interpreted as a state of a qubit. Therefore, the second term is bounded by 0 and 1, and consequently $1 \le S(\Lambda(\varrho_{\mathrm{in}})) \le 2$. It is a straightforward observation that the Holevo quantity is bounded from above by the difference between the maximum and the minimum possible entropies of states emerging from the channel. Therefore we obtain $\chi^{(1)} \le 1$. This inequality can be saturated simply by taking a one-photon state confined either to the first or to the second temporal slot, with an arbitrary polarization. Thus, the channel capacity is not enhanced in the one-photon sector.
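The one-photon argument can be made concrete with a few numbers. The sketch below evaluates the output entropy as one bit plus the entropy of the effective qubit with diagonal elements $a$, $1-a$ and off-diagonal element $\eta b$, using the closed-form eigenvalues of a $2 \times 2$ density matrix. The function name and the sample inputs are assumptions chosen for illustration.

```python
import math

def one_photon_output_entropy(a, b, eta):
    """Output entropy 1 + S([[a, eta*b], [conj(eta*b), 1 - a]]) in bits.

    The 2x2 matrix has eigenvalues 1/2 +/- sqrt((a - 1/2)**2 + |eta*b|**2).
    """
    radius = min(0.5, math.sqrt((a - 0.5) ** 2 + abs(eta * b) ** 2))
    entropy = 0.0
    for eigenvalue in (0.5 + radius, 0.5 - radius):
        if eigenvalue > 0.0:
            entropy -= eigenvalue * math.log2(eigenvalue)
    return 1.0 + entropy

# Photon placed in slot A (a = 1) or in slot B (a = 0), no coherence (b = 0).
s_slot_a = one_photon_output_entropy(1.0, 0.0, eta=1.0)
s_slot_b = one_photon_output_entropy(0.0, 0.0, eta=1.0)
# Equal mixture of the two signals: a = 1/2, b = 0.
s_average = one_photon_output_entropy(0.5, 0.0, eta=1.0)
chi_one_photon = s_average - 0.5 * (s_slot_a + s_slot_b)
print(chi_one_photon)

# A photon coherently delocalized over both slots, with partially correlated noise.
print(one_photon_output_entropy(0.5, 0.5, eta=0.5))
```

The slot-position encoding reaches the one-bit limit regardless of $\eta$, since it does not rely on the coherence $b$.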



B. Two-photon subspace 

The most interesting regime is when both the temporal slots are occupied by photons. As we will see below, in this case quantum correlations can enhance the capacity of the channel. If we take a normalized input state $\varrho_{\mathrm{in}}$ from the two-photon subspace $\mathcal{H}^{(2)}$, then the map $\Lambda_{\mathrm{perf}}$ produces a Werner state [13]:

$$\Lambda^{(2)}_{\mathrm{perf}}(\varrho_{\mathrm{in}}) = W_c, \tag{18}$$

where we have introduced the following notation:

$$W_c = -c\,|\Psi_-\rangle\langle\Psi_-| + (1+c)\,\frac{\mathbb{1}}{4}, \tag{19}$$

and we will use for $c$ the name of the Werner parameter of the input state $\varrho_{\mathrm{in}}$, defined as:

$$c = \frac{1}{3}\big(1 - 4\langle\Psi_-|\varrho_{\mathrm{in}}|\Psi_-\rangle\big). \tag{20}$$

This result, derived previously in Ref. [13], can be verified independently using the property given in Eq. (13).

The second operation affecting the input state is the partially depolarizing channel $\mathbb{1} \otimes \Lambda_{\mathrm{dep}}$. We will assume that the action of the map $\Lambda_{\mathrm{dep}}$ on the photon in the second temporal slot is simply isotropic depolarization shrinking the length of the Bloch vector by a factor $\eta$ satisfying $0 \le \eta \le 1$. Such an operation preserves the Werner form of the transmitted state, and its only effect is the multiplication of the parameter $c$ by the factor $\eta$. Thus, the state emerging from the channel is given by:

$$\Lambda^{(2)}(\varrho_{\mathrm{in}}) = W_{\eta c}, \tag{21}$$

with the parameter $c$ defined by the input state $\varrho_{\mathrm{in}}$ according to Eq. (20).

At this point the possibility of enhancing the communication capacity by exploiting entanglement manifests itself. The difference between the separable and entangled alphabets can be seen by comparing the allowed ranges of the parameter $c$. The positivity of the input density matrix $\varrho_{\mathrm{in}}$ requires that

$$-1 \le c \le 1/3, \tag{22}$$

and this is the only condition if we consider the most general, possibly entangled input states. However, if the input states are restricted to separable ones, then as shown by Horodeccy the allowed range for the parameter $c$ is reduced to

$$-1/3 \le c \le 1/3. \tag{23}$$

This limitation will underlie the reduced channel capacity in the case of separable states.

As the two-photon states emerging from the channel are fully characterized by the Werner parameters of the respective input states, optimization of the Holevo quantity can be carried out over the ensemble $\{q_j; c_j\}$ of the probabilities $q_j$ of sending the $j$th state with the Werner parameter equal to $c_j$. The output states emerging from the channel are therefore given by an ensemble of Werner states $\{q_j; W_{\eta c_j}\}$. Because a statistical mixture of Werner states is also a Werner state with the average parameter:

$$\sum_j q_j W_{\eta c_j} = W_{\eta \sum_j q_j c_j}, \tag{24}$$

the Holevo quantity can be expressed with the help of a single real-valued function $f(c)$:

$$\begin{aligned} \chi^{(2)} &= S\Big(W_{\eta \sum_j q_j c_j}\Big) - \sum_j q_j S\big(W_{\eta c_j}\big) \\ &= f\Big(\eta \sum_j q_j c_j\Big) - \sum_j q_j f(\eta c_j), \end{aligned} \tag{25}$$

where the explicit form of the function $f(c)$ is given by:

$$f(c) = 2 - \tfrac{3}{4}(1+c)\log_2(1+c) - \tfrac{1}{4}(1-3c)\log_2(1-3c). \tag{26}$$
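The entropy function $f(c)$ can be cross-checked against the spectrum of a Werner state. In the sketch below, the eigenvalues are read off from the form $-c\,|\Psi_-\rangle\langle\Psi_-| + (1+c)\mathbb{1}/4$: one singlet eigenvalue $(1-3c)/4$ and a threefold triplet eigenvalue $(1+c)/4$. The helper names are hypothetical, and only the Python standard library is used.

```python
import math

def werner_eigenvalues(c):
    """Spectrum of W_c: one singlet eigenvalue and three equal triplet ones."""
    return [(1.0 - 3.0 * c) / 4.0] + [(1.0 + c) / 4.0] * 3

def entropy_from_eigenvalues(eigenvalues):
    """Von Neumann entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(x * math.log2(x) for x in eigenvalues if x > 0.0)

def f(c):
    """Closed-form entropy of W_c for -1 <= c <= 1/3."""
    total = 2.0
    if 1.0 + c > 0.0:
        total -= 0.75 * (1.0 + c) * math.log2(1.0 + c)
    if 1.0 - 3.0 * c > 0.0:
        total -= 0.25 * (1.0 - 3.0 * c) * math.log2(1.0 - 3.0 * c)
    return total

for c in (-1.0, -1.0 / 3.0, 0.0, 1.0 / 3.0):
    print(c, f(c), entropy_from_eigenvalues(werner_eigenvalues(c)))
```

The end points behave as expected: $c = -1$ is the pure singlet state with zero entropy, $c = 1/3$ is the normalized projector on the triplet subspace with entropy $\log_2 3$, and $c = 0$ is the completely mixed two-photon state with two bits of entropy.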

The optimization of the Holevo quantity, which in princi- 
ple needs to be performed over an arbitrarily large input 
ensemble of permitted quantum states, can be greatly 
simplified using the following observation. 

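Lemma 2: Let $f(\gamma)$ be a concave function defined on a closed interval $[\alpha, \beta]$, and let $q_j$ be a probability distribution for a set $\gamma_j$ of real numbers taken from the range $\alpha \le \gamma_j \le \beta$. Then the following inequality holds:

$$f\Big(\sum_j q_j \gamma_j\Big) - \sum_j q_j f(\gamma_j) \le \sup_{\alpha \le \gamma \le \beta} \Big( f(\gamma) - \frac{\beta - \gamma}{\beta - \alpha} f(\alpha) - \frac{\gamma - \alpha}{\beta - \alpha} f(\beta) \Big). \tag{27}$$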



Proof: The concavity of the function $f(\gamma)$ implies that for every $j$ we have:

$$f(\gamma_j) \ge \frac{\beta - \gamma_j}{\beta - \alpha} f(\alpha) + \frac{\gamma_j - \alpha}{\beta - \alpha} f(\beta). \tag{28}$$

If we now multiply the above inequality by $-q_j$, perform the summation over $j$, and add the term $f(\sum_j q_j \gamma_j)$ to both sides, we obtain an inequality whose left-hand side is identical with that of Eq. (27), and whose right-hand side is exactly the argument of the supremum for $\gamma = \sum_j q_j \gamma_j$. Obviously, this value of $\gamma$ lies between $\alpha$ and $\beta$, and consequently the supremum may only exceed the value obtained from this calculation. This confirms that Eq. (27) is indeed satisfied.

The above lemma reduces the whole problem of optimizing the Holevo bound to maximizing a one-parameter real-valued function that is the argument of the supremum on the right-hand side of Eq. (27). Inserting the explicit form of the function $f(\gamma)$ given in Eq. (26) and differentiating the resulting expression over $\gamma$ shows that the supremum on the right-hand side of Eq. (27) is attained for

$$\gamma_{\mathrm{opt}} = \frac{1 - 2^{4\mu/3}}{3 + 2^{4\mu/3}}, \tag{29}$$

where $\mu = [f(\beta) - f(\alpha)]/(\beta - \alpha)$.

As we have seen, the permitted range of the parameters $c_j$ characterizing the states belonging to the input ensemble depends on whether we allow the most general, possibly entangled states, or rather restrict the input to separable states only. If we assume that this range spans from $c_{\min}$ to $c_{\max}$,

$$c_{\min} \le c_j \le c_{\max}, \tag{30}$$

then we can easily apply Lemma 2 to the expression of the Holevo quantity $\chi^{(2)}$ in terms of the function $f(c)$ that has been given in the second line of Eq. (25). Taking $\alpha = \eta c_{\min}$ and $\beta = \eta c_{\max}$ and using the explicit value of the turning point derived in Eq. (29) yields the following bound:

$$\chi^{(2)} \le \log_2\big(3 + 2^{4\mu/3}\big) - f(\eta c_{\min}) + \mu\big(\eta c_{\min} - 1/3\big), \tag{31}$$

where $\mu$ is given in terms of the input ensemble characteristics as:

$$\mu = \frac{f(\eta c_{\max}) - f(\eta c_{\min})}{\eta\,(c_{\max} - c_{\min})}. \tag{32}$$



We will analyze in detail numerical values of the channel 
capacity in the next section. Before doing so, we will 
close this section by describing a simple intuitive picture 
of Lemma 2 that gives an additional insight into the form 
of the input ensemble. 
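The bound on $\chi^{(2)}$ can be checked numerically by comparing two evaluations of the supremum in Lemma 2: the closed form obtained by inserting $\gamma_{\mathrm{opt}}$ into the argument of the supremum, and a brute-force search over a grid of $\gamma$ values. The sketch below does this for perfectly correlated noise with the entangled and the separable ranges of the Werner parameter. It assumes $\eta > 0$ and $c_{\max} > c_{\min}$; the function names and the grid size are choices made for this illustration.

```python
import math

def f(c):
    """Entropy (bits) of a Werner state with parameter c, -1 <= c <= 1/3."""
    total = 2.0
    if 1.0 + c > 0.0:
        total -= 0.75 * (1.0 + c) * math.log2(1.0 + c)
    if 1.0 - 3.0 * c > 0.0:
        total -= 0.25 * (1.0 - 3.0 * c) * math.log2(1.0 - 3.0 * c)
    return total

def supremum_argument(gamma, alpha, beta):
    """f(gamma) minus the chord through (alpha, f(alpha)) and (beta, f(beta))."""
    chord = ((beta - gamma) * f(alpha) + (gamma - alpha) * f(beta)) / (beta - alpha)
    return f(gamma) - chord

def bound_closed_form(eta, c_min, c_max):
    """Supremum evaluated analytically at the stationary point gamma_opt."""
    alpha, beta = eta * c_min, eta * c_max
    mu = (f(beta) - f(alpha)) / (beta - alpha)
    return (math.log2(3.0 + 2.0 ** (4.0 * mu / 3.0))
            - f(alpha) + mu * (alpha - 1.0 / 3.0))

def bound_grid_search(eta, c_min, c_max, steps=20000):
    """Supremum estimated by scanning gamma over the interval [alpha, beta]."""
    alpha, beta = eta * c_min, eta * c_max
    return max(supremum_argument(alpha + (beta - alpha) * k / steps, alpha, beta)
               for k in range(steps + 1))

for label, c_min in (("entangled", -1.0), ("separable", -1.0 / 3.0)):
    print(label,
          bound_closed_form(1.0, c_min, 1.0 / 3.0),
          bound_grid_search(1.0, c_min, 1.0 / 3.0))
```

For $\eta = 1$ the entangled range gives exactly one bit, in agreement with the singlet-versus-triplet encoding described in the introduction, while the separable range gives a strictly smaller value.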



C. Graphical interpretation 

FIG. 2: Graphical representation of the maximization procedure for the two-photon subspace. The set of points $G_j$ corresponds to the output ensemble. The difference $f(\gamma) - g(\gamma)$ needs to be maximized over $\gamma$ in the interval $[\alpha, \beta]$.

The result of Lemma 2 can be visualized using the following geometrical reasoning depicted in Fig. 2. Consider a graph of the function $f(\gamma)$ versus its argument $\gamma$. The numbers $\gamma_j$ and the corresponding values of the function $f$ are given by a set of points $G_j = (\gamma_j, f(\gamma_j))$ in the plane of the graph. The probability distribution $q_j$ for the arguments $\gamma_j$ defines an average

$$G = \sum_j q_j G_j \tag{33}$$



that can be interpreted as a center of gravity for the system of points $G_j$ that have been assigned respective masses $q_j$. Obviously, if the probability distribution is arbitrary, then this average can lie anywhere within the convex polygon spanned by the points $G_j$. Since the function $f$ is strictly concave over the range considered, the whole polygon lies within the area bounded by the graph of the function $f(\gamma)$ on one side, and a straight line connecting the points $(\alpha, f(\alpha))$ and $(\beta, f(\beta))$ on the other side. This straight line is given by a function $g$ defined as:

$$g(\gamma) = \frac{\beta - \gamma}{\beta - \alpha} f(\alpha) + \frac{\gamma - \alpha}{\beta - \alpha} f(\beta). \tag{34}$$



The left-hand side of Eq. (27) is now given by the length of a vertical line connecting $G$ with the point $H' = (\bar\gamma, f(\bar\gamma))$ on the graph of the function $f(\gamma)$, where $\bar\gamma = \sum_j q_j \gamma_j$. Clearly, the line $GH'$ will always be equal in length to or shorter than the line $H'H''$, where the point $H'' = (\bar\gamma, g(\bar\gamma))$ lies on the graph of the function $g(\gamma)$. Furthermore, in order to find the maximum possible length of the line $H'H''$, it is clear from this geometric construction that we need to maximize the difference $f(\gamma) - g(\gamma)$ over $\gamma$ belonging to the interval $[\alpha, \beta]$. This procedure is expressed explicitly on the right-hand side of Eq. (27), and the parameter $\mu$ introduced in the previous subsection is simply the slope of the function $g(\gamma)$.

It is clearly seen from this geometric construction that enlarging the interval $[\alpha, \beta]$ can only increase the value of the upper bound given in Eq. (27). This implies two rather straightforward observations. First, the use of entangled states should give a larger capacity compared to separable states. Second, a lower value of the parameter $\eta$, meaning weaker correlations between consecutive polarization rotations, results in a decreased channel capacity.

FIG. 3: Depiction of the optimal ensembles that maximize the Holevo quantity in both the general entangled case and the restricted separable case, for perfectly correlated noise ($\eta = 1$). It is sufficient to take only two-element ensembles with the extreme points of the allowed interval. For general entangled states the interval is $[-1, 1/3]$, whereas for the separable case the interval is reduced to $[-1/3, 1/3]$.

The graphical construction presented above also gives a simple recipe for constructing an output ensemble that saturates the bound on the Holevo quantity. It is sufficient to take a two-element ensemble with the extreme points of the allowed interval as the parameters of the Werner states emerging from the channel: $\alpha = \eta c_{\min}$ and $\beta = \eta c_{\max}$. The optimal probabilities of using the two states need to be selected in such a way that the weighted sum of the points corresponding to these states gives the point $\gamma_{\mathrm{opt}}$ maximizing the difference $f(\gamma) - g(\gamma)$. Explicitly, these probabilities are respectively given by $(\beta - \gamma_{\mathrm{opt}})/(\beta - \alpha)$ and $(\gamma_{\mathrm{opt}} - \alpha)/(\beta - \alpha)$. The actual graph of the function $f(\gamma)$ with the permitted ranges of the Werner parameter for perfectly correlated noise and entangled and separable inputs is shown in Fig. 3.
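The recipe for the two-element ensemble can be turned into a short calculation. The sketch below computes $\gamma_{\mathrm{opt}}$, the two probabilities, and the resulting Holevo quantity for a given $\eta$ and range of the Werner parameter. It assumes $\eta > 0$; the function names are hypothetical and only the Python standard library is used.

```python
import math

def f(c):
    """Entropy (bits) of a Werner state with parameter c, -1 <= c <= 1/3."""
    total = 2.0
    if 1.0 + c > 0.0:
        total -= 0.75 * (1.0 + c) * math.log2(1.0 + c)
    if 1.0 - 3.0 * c > 0.0:
        total -= 0.25 * (1.0 - 3.0 * c) * math.log2(1.0 - 3.0 * c)
    return total

def optimal_two_state_ensemble(eta, c_min, c_max):
    """Return (q_min, q_max, chi) for the ensemble built from the extreme points.

    q_min and q_max are the probabilities of sending the states with Werner
    parameters c_min and c_max; chi is the resulting Holevo quantity.
    """
    alpha, beta = eta * c_min, eta * c_max
    mu = (f(beta) - f(alpha)) / (beta - alpha)      # slope of the chord
    t = 2.0 ** (4.0 * mu / 3.0)
    gamma_opt = (1.0 - t) / (3.0 + t)               # stationary point
    q_min = (beta - gamma_opt) / (beta - alpha)
    q_max = (gamma_opt - alpha) / (beta - alpha)
    chi = f(gamma_opt) - q_min * f(alpha) - q_max * f(beta)
    return q_min, q_max, chi

print(optimal_two_state_ensemble(1.0, -1.0, 1.0 / 3.0))        # entangled inputs
print(optimal_two_state_ensemble(1.0, -1.0 / 3.0, 1.0 / 3.0))  # separable inputs
```

For perfectly correlated noise, the entangled ensemble uses both states with equal probability, whereas the separable ensemble is biased towards the state with the larger Werner parameter.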

V. ATTAINABILITY AND IMPLEMENTATION 

The Holevo quantity $\chi$ is only an upper bound on the channel capacity and therefore is not necessarily attainable. Users of a communication channel need two relevant pieces of information. The first one is the optimal form of the input ensemble that should be used by the sender. The second one is a measurement scheme that should be employed at the output of the channel in order to optimize the capacity.

Let us start by summarizing the results of the preceding section and specifying the input ensemble implied by these considerations. We have seen that in the zero- and one-photon subspaces the channel capacity cannot be enhanced by exploiting the polarization degree of freedom. Therefore as the elements of the input ensemble we can take for example the states $|0_A 0_B\rangle$, $|{\updownarrow}_A 0_B\rangle$, and $|0_A {\updownarrow}_B\rangle$, where for concreteness we have fixed the polarization of single-photon states to vertical. The polarization degree of freedom starts to play a nontrivial role when both the temporal slots are occupied by photons. In this subspace, we need to select two input states characterized by Werner parameters that are as distant as the constraints on the input ensemble allow. If we restrict ourselves to separable states, then according to Eq. (23) we need to take one separable state with $c_{\min} = -1/3$ and another one with $c_{\max} = 1/3$. It is easy to verify using Eq. (20) that the pair of separable states satisfying this condition can be taken as $|{\updownarrow}_A {\leftrightarrow}_B\rangle$ and $|{\updownarrow}_A {\updownarrow}_B\rangle$. We thus see that, in agreement with the simple picture developed in the introduction to this paper, the relevant quantity is the relative polarization of the photons occupying consecutive slots. If we allow for entangled input, then the lower limit for the Werner parameters of the input states shifts down to $c_{\min} = -1$. This value can of course be attained by taking the singlet state $|\Psi_-\rangle$ itself as one element of the input ensemble, and any state with $c_{\max} = 1/3$, for example again $|{\updownarrow}_A {\updownarrow}_B\rangle$, as the second one.

In order to complete the description of the communi- 
cation protocol, we need to specify the measurement ap- 
plied to the states emerging from the channel. This task 
can be decomposed into two steps. The first one is the de- 
termination of the total number of photons contained in 
the two slots and it can in principle be accomplished by 
a collective quantum non-demolition measurement [15]
on all the modes involved that would determine the total
photon number without destroying coherence between
the modes. Depending on the outcome, the second step 
needs to be either finding the temporal slot occupied by 
a photon in the one-photon subspace which can be real- 
ized by direct temporally resolved detection, or discrim- 
inating between the states used to encode information 
in the two-photon subspace. It is easy to see that this 
discrimination takes a simple form in the case of per- 
fectly correlated noise and entangled input states: we 
need to determine whether the received states belong to 
the singlet or the triplet subspace, which corresponds to 
a two-element projective measurement: 

$$\hat{O}_S = |\Psi_-\rangle\langle\Psi_-|, \qquad \hat{O}_T = \mathbb{1} - |\Psi_-\rangle\langle\Psi_-|. \tag{35}$$
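The statistics of this two-outcome measurement follow directly from the Werner form of the output states: for an output state with parameter $\eta c$, the singlet outcome occurs with probability $(1 - 3\eta c)/4$ and the triplet outcome with probability $3(1 + \eta c)/4$. The sketch below uses these probabilities to compute the mutual information between a two-element input ensemble and the measurement outcome. The function names and the input probabilities in the usage lines are illustrative assumptions.

```python
import math

def singlet_probability(c, eta):
    """Probability of the singlet outcome for the output state W_{eta*c}."""
    return (1.0 - 3.0 * eta * c) / 4.0

def shannon(probabilities):
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)

def mutual_information(q_min, c_min, c_max, eta):
    """Mutual information (bits) between the input label and the outcome.

    q_min is the probability of sending the state with Werner parameter c_min;
    the state with parameter c_max is sent with probability 1 - q_min.
    """
    inputs = ((q_min, c_min), (1.0 - q_min, c_max))
    p_singlet = sum(q * singlet_probability(c, eta) for q, c in inputs)
    output_entropy = shannon([p_singlet, 1.0 - p_singlet])
    conditional_entropy = sum(
        q * shannon([singlet_probability(c, eta),
                     1.0 - singlet_probability(c, eta)])
        for q, c in inputs)
    return output_entropy - conditional_entropy

# Perfectly correlated noise: singlet versus a triplet state, equal probabilities.
print(mutual_information(0.5, -1.0, 1.0 / 3.0, eta=1.0))
# Separable inputs with c = -1/3 and c = 1/3, sent with probabilities 0.4 and 0.6.
print(mutual_information(0.4, -1.0 / 3.0, 1.0 / 3.0, eta=1.0))
```

In the first case the two inputs lead to different outcomes with certainty, so a full bit is recovered; in the second case the state with $c = -1/3$ gives either outcome with equal probability and the mutual information drops accordingly.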

It turns out that the same measurement saturates the
Holevo bound also in the general case of any value of
the parameter $\eta$ with either entangled or separable
input states. In Fig. 4(a) we depict conditional probabilities
of obtaining the singlet or the triplet outcomes for a
two-element input ensemble characterized by Werner
parameters $c_{\min}$ and $c_{\max}$. A lengthy but straightforward
calculation shows that if we take as the input probabilities
the values discussed in the preceding section, the
mutual information is given exactly by the right hand
side of Eq. (31). Thus the described procedure indeed
maximizes the channel capacity in the two-photon subspace.

FIG. 4: Depiction of the outcomes of operator measurements
$\hat{O}_S$ and $\hat{O}_T$. The general case is shown in (a). For
perfectly correlated noise, when the full range of allowed
entangled states is employed, perfect distinguishability between the
two inputs is possible as shown in (b). In the restricted
separable states only regime, the diagram reduces to that shown
in (c) and the emerging states cannot be distinguished
unambiguously.

It is instructive to compare the above diagram for optimal
entangled and separable input ensembles in the case
of perfect correlations $\eta = 1$. For the optimal entangled
ensemble, shown in Fig. 4(b), we can distinguish perfectly
between the two inputs as they belong to orthogonal
subspaces even after the transmission. For the separable
ensemble, the emerging states can no longer be perfectly
discriminated, as seen in Fig. 4(c).

The complete channel capacity obtained by combining
Eq. ||SJ| with the results of Sec. IV is shown as a function
of $\eta$ in Fig. 5. It is seen that using an entangled input
ensemble gives a clear advantage over the separable states
over the complete range of the correlation parameter $\eta$.

FIG. 5: Plot of $\chi$ versus $\eta$. The channel capacity
for the general case where entangled states are used is
significantly greater than for the restricted case where only
separable states are employed. The dashed line is the channel
capacity when the polarization degree of freedom is not used
at all.

We note that the measurement discriminating between
the singlet and the triplet subspaces can be implemented
using the Braunstein-Mann scheme based on linear optics
[16], as we do not have to distinguish between all four
Bell states. After overlapping temporally the received
photons and interfering them on a 50:50 non-polarizing
beam splitter, their detection in the same output port
corresponds to a projection onto the triplet subspace,
whereas measuring them in the separate output ports of
the beam splitter identifies the singlet state.



VI. CONCLUSIONS


We have introduced a model of a communication chan- 
nel with correlated noise motivated by random birefrin- 
gence fluctuations in a fiber optic link. Within this 
model, we have demonstrated that introducing quantum 
correlations between consecutive uses of the channel in- 
creases its capacity. This demonstrates how specifically 
quantum phenomena such as entanglement can be helpful 
in the task of transferring classical information. Making 
use of entanglement requires more complex preparation 
procedures that provide joint input states extending over 
a number of temporal slots. A related question is the 
role of collective quantum measurements on the output 
of the channel rather than detecting radiation in each of 
the slots individually and combining classical outcomes 
of separate measurements. 

The action of the channel has been defined in terms
of transformations of the bosonic field operators. This
opens up a route towards interesting generalizations of
the present work, for example including arbitrary
multi-photon states. Another direction would be extending the
model to an arbitrary number of temporal slots rather
than just allowing for correlations between pairs of
consecutive slots as in our example. It is easy to give a simple
protocol showing that in this case the channel capacity
can be enhanced even further. Suppose that the sender
generates a train of zero- and one-photon states with the
same probabilities equal to one half. The first time she
is to transmit a photon, she sends one half of a maximally
entangled pair. The second time a photon is to be
transmitted, she sends the remaining member of the pair,
transforming it in such a way that the joint two-photon
polarization state belongs either to the singlet or the
triplet subspace. The receiver implements a
polarization-independent quantum non-demolition measurement
on each temporal slot. When a photon is detected,
it needs to be stored until the arrival of the second
member of the pair, when the discrimination between the
singlet and the triplet subspaces can be performed with
the help of a joint measurement. If the fluctuations in
random birefringence can be neglected over the temporal
separation between the photons in a pair, this procedure
allows one to encode one extra bit of information into
each pair of transmitted photons. This gives an average
channel capacity of 2.5 bits per pair of temporal slots,
enhancing further the optimal value shown in Fig.
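
The value of 2.5 follows from simple counting, which can be checked with the short sketch below. It assumes the idealizations stated in the text (ideal photon-number detection and negligible birefringence fluctuations within a pair); the variable names are illustrative only.

```python
from math import log2

p_photon = 0.5  # probability that a given temporal slot carries a photon

# Bits per slot carried by the zero/one-photon choice alone
# (Shannon entropy of a binary source with probability p_photon).
bits_per_slot = -(p_photon * log2(p_photon)
                  + (1 - p_photon) * log2(1 - p_photon))

# Each entangled pair carries one extra bit (singlet versus triplet).
# A pair consists of two photons, which occupy 2 / p_photon slots on average.
extra_bits_per_photon_pair = 1.0
slots_per_photon_pair = 2 / p_photon
extra_bits_per_slot = extra_bits_per_photon_pair / slots_per_photon_pair

# Capacity per pair of temporal slots.
capacity_per_two_slots = 2 * (bits_per_slot + extra_bits_per_slot)
print(capacity_per_two_slots)  # 2.5
```

Photon number alone contributes 2 bits per pair of slots, and the polarization encoding adds one bit per four slots on average, that is 0.5 bit per pair of slots.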
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